Building Tagged Linguistic Unit Databases for Sentiment Detection

Authors

  • Arno Scharl
  • Albert Weichselbraun
  • Stefan Gindl
Abstract

Despite the obvious business value of visualizing similarities between elements of evolving information spaces and mapping these similarities, e.g., onto geospatial reference systems, analysts are often more interested in how the semantic orientation (sentiment) towards an organization, a product or a particular technology is changing over time. Unfortunately, popular methods that process unstructured textual material to detect semantic orientation automatically based on tagged dictionaries [Scharl et al. 2003] are not capable of fulfilling this task, even when coupled with part-of-speech tagging, a standard component of most text processing toolkits that distinguishes grammatical categories such as article (AT), noun (NN), verb (VB), and adverb (RB). Small corpus size, ambiguity and subtle incremental change of tonal expressions between different versions of a document complicate the detection of semantic orientation and often prevent promising algorithms from being incorporated into commercial applications. Parsing grammatical structures, by contrast, outperforms dictionary-based approaches in terms of reliability, but usually suffers from poor scalability due to its computational complexity. This paper addresses this predicament by presenting an alternative approach based on automatically building Tagged Linguistic Unit (TLU) databases to overcome the restrictions of dictionaries with a limited set of tagged tokens.
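
For readers unfamiliar with the dictionary-based baseline that the abstract contrasts with TLU databases, the following is a minimal sketch of that baseline only. It is not the authors' method or code: the dictionary entries, the function names `tokenize` and `dictionary_sentiment`, and the +1/-1 scoring scheme are all assumptions made for illustration.

```python
import re

# Hypothetical tagged dictionary (an assumption, not from the paper):
# each token maps to a sentiment value, +1 for positive and -1 for negative.
TAGGED_DICTIONARY = {
    "good": 1,
    "excellent": 1,
    "reliable": 1,
    "bad": -1,
    "poor": -1,
    "unreliable": -1,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_sentiment(text, dictionary=TAGGED_DICTIONARY):
    """Sum the sentiment values of all tokens found in the dictionary.

    Tokens that are not in the dictionary contribute 0, so the score
    depends entirely on the limited set of tagged tokens.
    """
    return sum(dictionary.get(token, 0) for token in tokenize(text))


if __name__ == "__main__":
    for sentence in (
        "The product is good and reliable.",
        "The service is not good.",
    ):
        print(sentence, dictionary_sentiment(sentence))
```

In this sketch, a sentence such as "The service is not good." still receives a positive score, because a plain token lookup ignores negation and context. This is one simple illustration of why a small set of tagged tokens can be too restrictive.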

Similar resources

An Evaluation Framework and Adaptive Architecture for Automated Sentiment Detection

Analysts are often interested in how sentiment towards an organization, a product or a particular technology changes over time. Popular methods that process unstructured textual material to automatically detect sentiment based on tagged dictionaries are not capable of fulfilling this task, even when coupled with part-of-speech tagging, a standard component of most text processing toolkits that d...

Gulf Arabic Linguistic Resource Building for Sentiment Analysis

This paper deals with building linguistic resources for Gulf Arabic, one of the varieties of Arabic, for the sentiment analysis task using machine learning. To our knowledge, no previous work has addressed Gulf Arabic sentiment analysis, despite its presence on various online platforms. Hence, the first challenge is the absence of annotated data and sentiment lexicons. To fill this g...

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the growth of digital content on the internet and social media, sentiment analysis has become an emerging field. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...

Czech Subjectivity Lexicon: A Lexical Resource for Czech Polarity Classification

This paper introduces the Czech subjectivity lexicon, a new lexical resource for sentiment analysis in Czech. The lexicon is a dictionary of 4947 evaluative items annotated with part of speech and tagged with positive or negative polarity. We describe the method for building the basic vocabulary and the criteria for its manual refinement. Also, we suggest possible enrichment of the fundamental l...

Sentiment analysis of online spoken reviews

This paper describes several experiments in building a sentiment analysis classifier for spoken reviews. We specifically focus on the linguistic component of these reviews, with the goal of understanding the difference in sentiment classification performance when using manual versus automatic transcriptions, as well as the difference between spoken and written reviews. We introduce a novel data...

Publication date: 2008